Emoji Prediction
Understanding Textual Emotion Through Emoji Prediction
Gordon, Ethan, Kuppa, Nishank, Tummala, Rigved, Anasuri, Sriram
This project explores emoji prediction from short text sequences using four deep learning architectures: a feed-forward network, a CNN, a transformer, and BERT. Using the TweetEval dataset, we address class imbalance through focal loss and regularization techniques. Results show that BERT achieves the highest overall performance due to its pre-training advantage, while the CNN demonstrates superior efficacy on rare emoji classes. This research shows the importance of architecture selection and hyperparameter tuning for sentiment-aware emoji prediction, contributing to improved human-computer interaction.
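The focal loss the authors use against class imbalance can be sketched in a few lines. This is a minimal per-example illustration, not the paper's implementation; the `alpha` and `gamma` defaults are common illustrative values, not the paper's settings:

```python
import math

def focal_loss(p_correct, gamma=2.0, alpha=0.25):
    """Focal loss for a single example, given the model's predicted
    probability of the true class:
        FL(p_t) = -alpha * (1 - p_t)**gamma * log(p_t)
    The (1 - p_t)**gamma factor down-weights easy, high-confidence
    examples so training gradient concentrates on hard/rare classes.
    """
    return -alpha * (1.0 - p_correct) ** gamma * math.log(p_correct)

# With gamma = 0 the modulating factor vanishes and focal loss
# reduces to (alpha-scaled) cross-entropy.
ce = focal_loss(0.9, gamma=0.0, alpha=1.0)
# With gamma = 2, this easy example (p_t = 0.9) is scaled down
# by (1 - 0.9)**2 = 0.01 relative to plain cross-entropy.
fl = focal_loss(0.9, gamma=2.0, alpha=1.0)
```

In a real training loop the same expression is applied per class over softmax outputs and averaged over the batch; the key effect is that confidently classified frequent-emoji examples contribute almost nothing, leaving the rare classes to drive the updates.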
On-Device Emoji Classifier Trained with GPT-based Data Augmentation for a Mobile Keyboard
Amer, Hossam, Osborne, Joe, Zaki, Michael, Afify, Mohamed
Emojis improve communication quality among smartphone users who exchange text via mobile keyboards. To predict emojis from input text on device, we must respect tight memory and latency constraints, ensure that the classifier covers a wide range of emoji classes even though emoji datasets are typically imbalanced, and adapt the classifier's output to each user's favorites. This paper proposes an on-device emoji classifier based on MobileBERT with reasonable memory and latency requirements for SwiftKey. To account for the data imbalance, we use GPT to generate one or more tags for each emoji class. For each emoji and its corresponding tags, we merge the original set with GPT-generated sentences and label them with this emoji, without human intervention, to alleviate the data imbalance. At inference time, we interpolate the classifier output with the user's emoji history for better emoji classifications. Results show that the proposed on-device classifier deployed in SwiftKey improves emoji-prediction accuracy, particularly on rare emojis, and increases emoji engagement.
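The inference-time personalization step — interpolating the classifier's distribution with the user's own emoji history — can be sketched as a simple convex combination. This is a hypothetical illustration; the interpolation weight `lam` and the count-normalization scheme are assumptions, not the paper's exact method:

```python
def personalize(model_probs, user_counts, lam=0.8):
    """Blend classifier probabilities with the user's emoji usage history.

    model_probs : dict emoji -> model probability
    user_counts : dict emoji -> how often this user picked that emoji
    lam         : weight on the model; (1 - lam) goes to user history
    """
    # Normalize the user's historical counts over the candidate emojis.
    total = sum(user_counts.get(e, 0) for e in model_probs) or 1
    return {
        e: lam * p + (1 - lam) * user_counts.get(e, 0) / total
        for e, p in model_probs.items()
    }
```

With `lam=1.0` (or an empty history) the output equals the raw model distribution; lower `lam` shifts probability mass toward the user's favorites, which is the stated goal of adapting predictions to user preferences.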
Contrastive Learning-based Multi Modal Architecture for Emoticon Prediction by Employing Image-Text Pairs
Pandey, Ananya, Vishwakarma, Dinesh Kumar
Emoticons are symbolic representations that generally accompany textual content to visually enhance or summarize the true intention of a written message. Although widely used on social media, the core semantics of these emoticons have not been extensively explored across multiple modalities. Combining textual and visual information within a single message offers a richer way of conveying meaning. Hence, this research analyzes the relationship among sentences, visuals, and emoticons. For an orderly exposition, this paper first provides a detailed examination of the various techniques for extracting multimodal features, emphasizing the pros and cons of each method. After a comprehensive examination of several multimodal algorithms, with specific emphasis on fusion approaches, we propose a novel contrastive learning-based multimodal architecture. The proposed model jointly trains a dual-branch encoder with a contrastive objective to accurately map text and images into a common latent space. Our key finding is that integrating contrastive learning with the two encoder branches yields superior results. The experimental results demonstrate that our methodology surpasses existing multimodal approaches in accuracy and robustness, attaining 91% accuracy and an MCC score of 90% on the Multimodal-Twitter Emoticon dataset acquired from Twitter. We provide evidence that deep features acquired through contrastive learning are more efficient, suggesting that the proposed fusion technique also generalises well when recognising emoticons across multiple modalities.
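The contrastive objective that pulls matched text/image pairs together in a shared latent space is typically an InfoNCE-style loss. The following is a toy, pure-Python sketch of that idea (the paper does not publish this exact formulation; the temperature `tau` and the symmetric batch treatment are generic assumptions):

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def info_nce(text_embs, image_embs, tau=0.1):
    """InfoNCE-style contrastive loss over a batch of paired embeddings.

    For each text i, the matching image i is the positive and every
    other image in the batch is a negative; the loss is the average
    cross-entropy of picking the positive under a softmax over
    temperature-scaled cosine similarities.
    """
    n = len(text_embs)
    loss = 0.0
    for i in range(n):
        sims = [math.exp(cosine(text_embs[i], image_embs[j]) / tau)
                for j in range(n)]
        loss += -math.log(sims[i] / sum(sims))
    return loss / n
```

Minimizing this loss drives each text embedding toward its paired image embedding and away from the others, which is exactly the "common latent space" property the abstract describes.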
EmojiLM: Modeling the New Emoji Language
Peng, Letian, Wang, Zilong, Liu, Hang, Wang, Zihan, Shang, Jingbo
With the rapid development of the internet, online social media welcomes people of different backgrounds through its diverse content. The growing use of emoji is a noticeable trend thanks to emoji's rich information that crosses cultural and linguistic borders. However, current research on emojis is restricted to single-emoji prediction, and few data resources are available for further study of this interesting linguistic phenomenon. To this end, we synthesize a large text-emoji parallel corpus, Text2Emoji, from a large language model. Based on the parallel corpus, we distill a sequence-to-sequence model, EmojiLM, specialized in bidirectional text-emoji translation. Extensive experiments on public benchmarks and human evaluation demonstrate that our proposed model outperforms strong baselines and that the parallel corpus benefits emoji-related downstream tasks.
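One common way to train a single seq2seq model for bidirectional translation is to emit each parallel-corpus pair in both directions with a task prefix. The sketch below shows that data-preparation step; the prefix strings are an illustrative convention, not necessarily what EmojiLM uses:

```python
def make_bidirectional_pairs(corpus):
    """Expand (text, emoji_sequence) pairs into both translation
    directions for one shared seq2seq model, using task prefixes
    (a T5-style convention; the exact prefixes are an assumption)."""
    examples = []
    for text, emojis in corpus:
        examples.append(("text2emoji: " + text, emojis))
        examples.append(("emoji2text: " + emojis, text))
    return examples
```

Training on both directions at once lets one model serve text-to-emoji and emoji-to-text translation, matching the bidirectional capability the abstract claims.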
Emoji Prediction in Tweets using BERT
Nusrat, Muhammad Osama, Habib, Zeeshan, Alam, Mehreen, Jamal, Saad Ahmed
In recent years, the use of emojis in social media has increased dramatically, making them an important element in understanding online communication. However, predicting the meaning of emojis in a given text is a challenging task due to their ambiguous nature. In this study, we propose a transformer-based approach for emoji prediction using BERT, a widely used pre-trained language model. We fine-tuned BERT on a large corpus of tweets containing both text and emojis to predict the most appropriate emoji for a given text. Our experimental results demonstrate that our approach outperforms several state-of-the-art models, predicting emojis with an accuracy of over 75 percent. This work has potential applications in natural language processing, sentiment analysis, and social media marketing.
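Fine-tuned BERT ends in a standard classification head: logits over the emoji label set, softmaxed to pick the most probable emoji. That final decoding step can be sketched in plain Python (illustrative only; the label set and logits here are made up, not from the paper):

```python
import math

def predict_emoji(logits, labels):
    """Decode a classification head's output: softmax the logits
    (with a max-shift for numerical stability) and return the
    highest-probability emoji label plus its confidence."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = probs.index(max(probs))
    return labels[best], probs[best]
```

The reported "accuracy of over 75 percent" is then just the fraction of held-out tweets for which this argmax label matches the emoji the user actually typed.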
Federated Learning Based Multilingual Emoji Prediction In Clean and Attack Scenarios
Gamal, Karim, Gaber, Ahmed, Amer, Hossam
Federated learning is a growing field in the machine learning community due to its decentralized and private design. Model training in federated learning is distributed over multiple clients, giving access to lots of client data while maintaining privacy. A server then aggregates the training done on these clients without access to their data; in our setting, that data includes emojis, which are widely used in social media services and instant-messaging platforms to express users' sentiments. This paper proposes federated learning-based multilingual emoji prediction in both clean and attack scenarios. Emoji prediction data were crawled from both Twitter and the SemEval emoji datasets. These data are used to train and evaluate transformer models of different sizes, including a sparsely activated transformer, under either the assumption of clean data on all clients or data poisoned via a label-flipping attack on some clients. Experimental results show that federated learning in both clean and attacked scenarios performs comparably to centralized training for multilingual emoji prediction on seen and unseen languages under different data sources and distributions. Our trained transformers also outperform other techniques on the SemEval emoji dataset, in addition to offering the privacy and distributed benefits of federated learning.
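The server-side aggregation step described here is, in the standard formulation, federated averaging (FedAvg): the server combines client model parameters weighted by each client's data size, never seeing the raw text. A minimal sketch over flat parameter vectors (an illustration of FedAvg in general, not this paper's exact training setup):

```python
def fed_avg(client_weights, client_sizes):
    """FedAvg aggregation: average client parameter vectors,
    weighting each client by the number of examples it trained on.
    The server only ever sees parameters, not the clients' data."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[k] * n for w, n in zip(client_weights, client_sizes)) / total
        for k in range(dim)
    ]
```

The label-flipping attack studied in the paper corresponds to some clients training on deliberately mislabeled examples before this averaging step; the experiments show the aggregated model still tracks centralized training closely.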
A Federated Approach to Predicting Emojis in Hindi Tweets
Gandhi, Deep, Mehta, Jash, Parekh, Nirali, Waghela, Karan, D'Mello, Lynette, Talat, Zeerak
The use of emojis affords a visual modality to often private textual communication. The task of predicting emojis, however, is a challenge for machine learning, as emoji use tends to cluster into the frequently used and the rarely used. Much of the machine learning research on emoji use has focused on high-resource languages and has conceptualised emoji prediction around traditional server-side approaches. Yet traditional machine learning approaches for private communication can introduce privacy concerns, as they require all data to be transmitted to central storage. In this paper, we address the dual concerns of over-emphasising high-resource languages for emoji prediction and risking the privacy of people's data. We introduce a new dataset of 118k tweets (augmented from 25k unique tweets) for emoji prediction in Hindi, and propose a modification to the federated learning algorithm, CausalFedGSD, which aims to strike a balance between model performance and user privacy. We show that our approach obtains scores comparable to more complex centralised models while reducing the amount of data required to optimise the models and minimising risks to user privacy.
VoiceMoji: A Novel On-Device Pipeline for Seamless Emoji Insertion in Dictation
Kumar, Sumit, S, Harichandana B S, Arora, Himanshu
Most speech recognition systems recover only the words in speech and fail to capture emotion, so users must manually add emoji(s) to text to convey tone and make communication fun. While much work exists on punctuation restoration for transcribed speech, emotion addition remains untouched. In this paper, we propose a novel on-device pipeline to enrich the voice input experience. Given a blob of transcribed text, the pipeline intelligently processes it and identifies the structure where emoji insertion makes sense. It then performs semantic text analysis to predict an emoji for each of the sub-parts, for which we propose a novel Attention-based Char Aware (ACA) LSTM architecture that also handles Out-Of-Vocabulary (OOV) words. All these tasks execute completely on-device and hence can aid on-device dictation systems. To the best of our knowledge, this is the first work to show how to add emoji(s) to transcribed text. We demonstrate that our components achieve results comparable to previous neural approaches for punctuation addition and emoji prediction with 80% fewer parameters. Overall, our proposed model has a very small memory footprint of a mere 4MB, suiting on-device deployment.
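The char-aware idea behind handling OOV words can be illustrated with a simple fallback: in-vocabulary tokens use their word embedding, while unseen tokens are represented from their characters, so the model never collapses to a blank `<unk>`. This is a simplified sketch of the general character-aware technique, not the paper's ACA-LSTM, which additionally combines the two views with attention:

```python
def embed_token(token, word_vecs, char_vecs, dim=4):
    """Return an embedding for `token`: the word vector if it is
    in-vocabulary, otherwise the average of its character vectors
    (zeros for unknown characters). Hypothetical toy vectors."""
    if token in word_vecs:
        return word_vecs[token]
    chars = [char_vecs.get(c, [0.0] * dim) for c in token]
    return [sum(v[k] for v in chars) / len(chars) for k in range(dim)]
```

Because dictated text is noisy (names, slang, disfluencies), this character fallback is what lets a tiny on-device model still produce a sensible representation for words it has never seen.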
Apple unfurls more millennial-friendly texting tools including 'emoji prediction'
Apple, known for its steady stream of slick consumer electronic devices, used its annual developer conference in San Francisco to roll out a raft of millennial-friendly texting tools that enhance emojis and image sharing and add animations to messages. Among a two-hour stream of product announcements at the annual Worldwide Developer Conference (WWDC) event, Apple engineers demonstrated the latest update of Apple's smartphone software iOS, which will now let iPhone users add larger emojis, see photos and videos appear in a stream of text messages, add animated effects and "emojify" messages by converting typed words into emoji. Opening the event with a moment of silence for the victims of the weekend's shooting at a gay night club in Orlando, CEO Tim Cook – who has become a leader on gay rights issues since talking about his own sexuality in 2014 – called the attacks a "senseless unconscionable act of terrorism and hate aimed at dividing and destroying". Cook then set about laying out his vision for a future where Apple's software forms the central hub of its customers' lives, helping track their fitness, send love notes, navigate the road and trade pictures of cute dogs. Apple is clearly responding to the voice of the consumer; Messages is the most popular app on iOS, and the new features are designed to offer more playful options that replicate some successful third-party messaging apps. "We're providing emoji predictions as you type," said Craig Federighi, Apple's senior vice president of software engineering.